6 research outputs found

    Methods for Addressing Data Diversity in Automatic Speech Recognition

    The performance of speech recognition systems is known to degrade in mismatched conditions, where the acoustic environment and the speaker population differ significantly between the training data and the target test data. Performance degradation due to this mismatch is widely reported in the literature, particularly for diverse datasets. This thesis approaches the mismatch problem in diverse datasets with several strategies, including data refinement, variability modelling and speech recognition model adaptation. These strategies are realised in six novel contributions. The first contribution is a data subset selection technique that uses a likelihood ratio, derived from a target test set, to quantify mismatch. The second contribution is a multi-style training method using data augmentation: the existing training data is augmented using a distribution of variabilities learnt from a target dataset, resulting in a matched set. The third contribution is a new approach for genre identification in diverse media data, with the aim of reducing the mismatch in an adaptation framework. The fourth contribution is a novel method that performs unsupervised domain discovery using latent Dirichlet allocation. Since the latent domains correlate highly with some subjective meta-data tags, such as genre labels of media data, features derived from the latent domains are successfully applied to the genre and broadcast show identification tasks. The fifth contribution extends the latent modelling technique to acoustic model adaptation, where latent-domain specific models are adapted from a base model. As the sixth contribution, an alternative adaptation approach is proposed in which subspace adaptation of deep neural network acoustic models is performed using the proposed latent-domain aware training procedure. All of the proposed techniques for mismatch reduction are verified using diverse datasets.
Using data selection, data augmentation and latent-domain model adaptation methods, the mismatch between the training and testing conditions of diverse ASR systems is reduced, resulting in more robust speech recognition systems.
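
The likelihood-ratio idea behind the first contribution can be illustrated with a toy example. The sketch below is not the thesis's method: it assumes one scalar feature per frame and single 1-D Gaussians for the target and background models, and the names `fit_gaussian`, `log_likelihood` and `select_by_likelihood_ratio` are hypothetical. It ranks each training utterance by its average per-frame log-likelihood ratio and keeps the best-matching ones.

```python
import math
from statistics import mean, pvariance


def fit_gaussian(values):
    """Fit a 1-D Gaussian, returned as (mean, variance) with a variance floor."""
    return mean(values), max(pvariance(values), 1e-6)


def log_likelihood(x, model):
    """Log density of x under a 1-D Gaussian model (mean, variance)."""
    mu, var = model
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def select_by_likelihood_ratio(pool, target_frames, keep):
    """Return the `keep` utterance ids that best match the target data.

    pool: dict mapping utterance id -> list of scalar frame features
          (an assumption made for this toy example).
    target_frames: list of scalar frame features from the target set.
    """
    target_model = fit_gaussian(target_frames)
    # Background model: all frames of the training pool taken together.
    background_model = fit_gaussian([x for frames in pool.values() for x in frames])
    scores = {
        utt: mean(
            log_likelihood(x, target_model) - log_likelihood(x, background_model)
            for x in frames
        )
        for utt, frames in pool.items()
    }
    # A higher average log-likelihood ratio means a closer match to the target.
    return sorted(scores, key=scores.get, reverse=True)[:keep]
```

A real system would use multi-dimensional acoustic features and richer models such as Gaussian mixtures; only the ranking principle carries over.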

    Halkeilleen aukollisen jäykistävän betoniseinän voimasuureiden arviointi (Evaluation of the internal forces in a cracked stiffening concrete wall with openings)

    This graduate thesis studies the cracking of the lintel beams that connect stiffening concrete shear walls in tall buildings. The aim was to develop a calculation worksheet for the cracked moment of inertia of reinforced concrete lintel beams; the worksheet was implemented in Mathcad 15.0. The work was commissioned by Wise Group Finland. Cracked stiffness was evaluated in the linear elastic range of material behaviour after the structure has cracked. The forces acting on the lintel beams were determined from FEM models built with Autodesk Robot Structures 2013. The effect of lintel beam cracking on the building's internal forces, displacements and lowest natural frequencies was studied with a FEM model in CSI ETABS version 9.7.4. The building studied was a real, recent design project of Wise Group Finland, chosen because its large number of lintel beams makes the assessment of their stiffness important.
The study shows that lintel beam stiffness affects the stress distribution in walls with openings: when the lintel beam is stiff, the wall segments tend to act as a single continuous wall, whereas cracking of the lintel beams weakens the link between the connected shear walls and therefore their coupling. The cracked moment of inertia was calculated as a function of the beam forces and the reinforcement; the stiffness of a cracked lintel beam depends in particular on the amount of reinforcement and also on the magnitude of the forces acting on it. Shear and torsional deformations, and the cracking they cause, were ignored when determining the cracked stiffness. The FEM results indicate that cracking of the lintel beams alone had only a minor effect on the building's displacements and dynamic properties. Although the building had a large number of shear walls connected by lintel beams, the cracking of the walls themselves was ignored, so the reduced moment of inertia of the lintel beams had little effect on the whole structure. Another reason was the large number of stiffening partition walls in the building. The effect on the lowest natural frequency and on the displacements was less than 4 % when only the cracking of lintel beams was taken into account.
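
For orientation, the cracked moment of inertia of a beam can be sketched with the standard transformed-section method for a singly reinforced rectangular section in pure bending. This is an assumption-laden illustration rather than the thesis's Mathcad worksheet: it ignores compression reinforcement and the dependence on beam forces mentioned in the abstract, and the function name and parameters are hypothetical.

```python
import math


def cracked_moment_of_inertia(b, d, a_s, e_s, e_c):
    """Neutral axis depth and cracked moment of inertia of a rectangular section.

    Assumptions for this sketch: singly reinforced section, linear elastic
    materials, concrete in tension fully ignored, pure bending.

    b:   section width (mm)
    d:   effective depth to the tension reinforcement (mm)
    a_s: area of tension reinforcement (mm^2)
    e_s: modulus of elasticity of steel (MPa)
    e_c: modulus of elasticity of concrete (MPa)
    Returns (x, i_cr): neutral axis depth (mm) and cracked inertia (mm^4).
    """
    n = e_s / e_c                 # modular ratio
    n_as = n * a_s                # steel area transformed to concrete
    # Equal first moments about the neutral axis: b*x^2/2 = n*As*(d - x)
    x = (-n_as + math.sqrt(n_as ** 2 + 2.0 * b * n_as * d)) / b
    # Compression block about the neutral axis plus the transformed steel
    i_cr = b * x ** 3 / 3.0 + n_as * (d - x) ** 2
    return x, i_cr


# Hypothetical example values, not taken from the thesis
x, i_cr = cracked_moment_of_inertia(b=300.0, d=500.0, a_s=1500.0,
                                    e_s=200000.0, e_c=30000.0)
print(f"x = {x:.1f} mm, I_cr = {i_cr:.3e} mm^4")
```

Because the transformed steel term grows with `a_s`, the sketch reproduces the qualitative finding that cracked stiffness depends strongly on the amount of reinforcement.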

    Interspeech 2016 - Experiment results for paper "webASR 2 - Improved cloud based speech technology"

    No full text
    The files in this dataset are results generated for the Interspeech 2016 article "webASR 2 - Improved cloud based speech technology" (DOI: 10.21437/Interspeech.2016-700, https://doi.org/10.21437/Interspeech.2016-700).

    The files included are of several types:
    - .ctm: output of an automatic speech recognition system.
    - .rttm: output of a speaker diarisation system.
    - .moses: output of a machine translation system.
    - .sys: scoring results of the corresponding system.

    Naming convention of the files:
    TableX-LineY: the output and scoring results corresponding to Line Y of Table X in the article.

    All file types are standard outputs that are recognised by the speech technology community and can be opened using any text editor.
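
Since the files are plain text, they can also be read programmatically. The following is a minimal sketch of a reader for the `.ctm` files, assuming the usual NIST CTM layout of whitespace-separated fields (recording id, channel, start time, duration, word, optional confidence) and `;;` comment lines; the exact layout of the files in this dataset has not been checked, and `CtmEntry` and `parse_ctm` are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class CtmEntry(NamedTuple):
    recording: str
    channel: str
    start: float                  # start time in seconds
    duration: float               # duration in seconds
    word: str
    confidence: Optional[float]   # None when the column is absent


def parse_ctm(text: str) -> List[CtmEntry]:
    """Parse CTM-formatted text into a list of word entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;"):   # skip blanks and comments
            continue
        fields = line.split()
        if len(fields) < 5:
            raise ValueError(f"malformed CTM line: {line!r}")
        confidence = float(fields[5]) if len(fields) > 5 else None
        entries.append(CtmEntry(fields[0], fields[1], float(fields[2]),
                                float(fields[3]), fields[4], confidence))
    return entries


# Made-up sample content for illustration only
sample = """;; comment line
show1 1 0.50 0.30 hello 0.98
show1 1 0.80 0.45 world
"""
for entry in parse_ctm(sample):
    print(entry.word, entry.start, entry.duration)
```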